Genome assembly, comparative genomics, and identification of genes/pathways underlying plant growth-promoting traits of an actinobacterial strain, Amycolatopsis sp. (BCA-696)

The draft genome sequence of an agriculturally important actinobacterial species, Amycolatopsis sp. BCA-696, was developed and characterized in this study. Amycolatopsis BCA-696 is known for its biocontrol properties against charcoal rot and for plant growth promotion (PGP) in several crop species. The next-generation sequencing (NGS)-based draft genome of Amycolatopsis sp. BCA-696 comprises a ~ 9.05 Mb linear chromosome with 68.75% GC content. In total, 8716 protein-coding sequences and 61 RNA-coding sequences were predicted in the genome. The newly developed genome sequence was also characterized for biosynthetic gene clusters (BGCs) and biosynthetic pathways. Furthermore, we report that Amycolatopsis sp. BCA-696 produces the glycopeptide antibiotic vancomycin, which inhibits the growth of pathogenic gram-positive bacteria. A comparative analysis of the BCA-696 genome with the publicly available genomes of 14 closely related strains of Amycolatopsis was also conducted. This analysis identified a total of 4733 core and 466 unique orthologous genes in the BCA-696 genome. The unique genes in BCA-696 were enriched with antibiotic biosynthesis and resistance functions. The genome assembly of BCA-696 also revealed genes involved in key pathways related to PGP and biocontrol traits, such as siderophore, chitinase, and cellulase production.

all of them were complete and just three of them were duplicated. Only one BUSCO was fragmented, indicating a high degree of completeness of the generated assembly.

Pan genome of Amycolatopsis species & genes specific to BCA-696
In order to infer the genomic similarity or variation among the Amycolatopsis species, a pan-genome analysis was conducted. A pan-genome comprising all genomes of the Amycolatopsis genus with a scaffold-level or higher assembly (n = 76) demonstrated huge diversity in gene composition: out of 375,299 orthologous groups, the core genes formed a much smaller set (< 1%) than the cloud genes (> 95%) (Supplementary Fig. S1). Since our focus was on genomic regions unique to Amycolatopsis sp. BCA-696, a subset comprising this genome and the genomes of fourteen closely related species/strains, all from a single clade within the pan-genome-based tree, was re-analyzed. A total of 35,318 orthologous gene clusters could be divided into a core genome of 3,627 (41.6%) orthologous gene clusters, having more than 99% similarity and present across all fifteen strains, and the unique genes, which ranged from 654 (in A. keratiniphila) to 2557 genes (in Amycolatopsis coloradensis) (Supplementary Fig. S2). The genome of Amycolatopsis sp. BCA-696 has 1423 (16.3%) strain-specific genes. Since as many as one-sixth (n = 1423) of the total genes were reported as unique, this result required rigorous evaluation using an algorithm that establishes orthology with high accuracy. An orthology search using the Orthofinder pipeline 24 reported a core set of 4,733 (42.6%) genes and 466 (4.2%) unique genes in the Amycolatopsis sp. BCA-696 genome (Fig. 3). Out of these 466 unique genes, only 53 could be functionally annotated (by the RAST pipeline 25 and Reciprocal Best Blast) (Supplementary Table S2). Among these 53 unique genes, one gene (GenBank ID: WYW18527.1) was found to be involved in the biosynthesis of the antibiotic Bialaphos (carboxyvinyl-carboxyphosphonate phosphorylase). Previously, this antibiotic, which possesses bactericidal, fungicidal, and herbicidal properties, was reported only in Streptomyces species 26 .
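The core/unique split described above can be illustrated with a minimal sketch. It assumes a presence/absence mapping from orthogroup to the set of strains that carry it; the strain names other than BCA-696, the orthogroup IDs, and the function name `classify_orthogroups` are hypothetical, and this is not the Roary or Orthofinder implementation.

```python
def classify_orthogroups(presence, strains):
    """Split orthogroups into core (present in every strain) and
    strain-specific (present in exactly one strain).

    presence: dict mapping orthogroup ID -> set of strain names (toy data).
    strains:  list of all strain names in the comparison.
    """
    n_strains = len(strains)
    core = []
    unique = {strain: [] for strain in strains}
    for orthogroup, members in presence.items():
        if len(members) == n_strains:
            core.append(orthogroup)
        elif len(members) == 1:
            (only_strain,) = members  # unpack the single member
            unique[only_strain].append(orthogroup)
    return core, unique


# Hypothetical toy input: three strains, five orthogroups.
strains = ["BCA-696", "strain_B", "strain_C"]
presence = {
    "OG0001": {"BCA-696", "strain_B", "strain_C"},
    "OG0002": {"BCA-696"},
    "OG0003": {"strain_B", "strain_C"},
    "OG0004": {"BCA-696", "strain_B", "strain_C"},
    "OG0005": {"strain_C"},
}
core, unique = classify_orthogroups(presence, strains)
print(len(core), {s: len(g) for s, g in unique.items()})
```

Orthogroups shared by some but not all strains (such as `OG0003` here) fall into neither list; they correspond to the accessory portion of the pan-genome.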

Phylogenetic relationships of BCA-696
To establish the taxonomic positioning of BCA-696 within the Amycolatopsis genus, we used methods based on the overall genome relatedness index, which showed that the Amycolatopsis sp. BCA-696 genome was closer to the genomes of A. lurida strains than to any other Amycolatopsis genome (Fig. 4). However, the bootstrap value for the branch leading to BCA-696 was poor (~ 50 out of 100), indicating uncertainty in the position of BCA-696. The previous taxonomic assignment of this strain was based on the 16S rRNA sequence; the draft genome now positions it in the phylogenetic tree equidistant to A. lurida and A. roodepoortensis, without being closer to either of them, suggesting that this strain might be a new species.
[...] S3, Supplementary Fig. S3). The commonly observed BGC classes were 'Resistance', 'Tailoring', 'Thiotemplated', 'Type II polyketide', 'ribosomally synthesized and post-translationally modified peptide product' (RiPP), 'phosphonate', etc. To test whether any of the genes unique to BCA-696 overlap with these BGCs, their genomic coordinates were compared; no overlap was found.
A detailed examination of core and additional biosynthetic genes showed two large clusters for the biosynthesis of two antibiotics, vancomycin and enediyne, spread within genomic coordinates 36,435-96,579 [...] S3). A full pathway for vancomycin biosynthesis was observed; for the enediyne family of antibiotics, however, pathways were observed for the biosynthesis of its core molecule (neocarzinostatin), necessary for further reactions, and one of its derivatives, maduropeptin.
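The coordinate comparison between unique genes and BGCs mentioned in this section amounts to an interval-overlap test. The following is a minimal sketch, assuming closed 1-based intervals on a single chromosome. Only the interval 36,435-96,579 comes from the text; the cluster label, the gene IDs, and the gene coordinates are hypothetical and chosen so that one gene overlaps, purely to exercise the function (the study itself found no overlap).

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if closed intervals [a_start, a_end] and [b_start, b_end] share a base."""
    return a_start <= b_end and b_start <= a_end


def genes_in_clusters(genes, clusters):
    """Return (gene_id, cluster_id) pairs whose coordinates overlap.

    genes, clusters: lists of (identifier, start, end) tuples.
    """
    hits = []
    for gene_id, g_start, g_end in genes:
        for cluster_id, c_start, c_end in clusters:
            if overlaps(g_start, g_end, c_start, c_end):
                hits.append((gene_id, cluster_id))
    return hits


# Interval taken from the text; the label "cluster_A" is a placeholder.
clusters = [("cluster_A", 36435, 96579)]
# Hypothetical gene coordinates for illustration only.
genes = [
    ("gene_x", 120000, 121500),  # lies outside the cluster
    ("gene_y", 96000, 97000),    # straddles the cluster's right edge
]
hits = genes_in_clusters(genes, clusters)
print(hits)
```

A nested loop is adequate for a few hundred genes against a few dozen clusters; for larger inputs, sorting both lists by start coordinate or using an interval tree would be the usual choice.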

Genes underlying PGP and bio-control traits/pathways
The Amycolatopsis sp. BCA-696 genome assembly was analyzed to identify the genes involved in PGP and biocontrol activities. The strain under study was experimentally validated to produce metabolites such as siderophores, hydrocyanic acid, and indole acetic acid, and to show enzymatic activity for cellulase, chitinase, lipase, and 1,3-beta-glucanase 16 . Among the siderophores, evidence for the presence of complete biosynthetic pathways for the production of catecholate or mixed-type siderophores, namely bacillibactin, enterochelin, mycobactin, etc., was examined in the genome annotation of Amycolatopsis sp. BCA-696. In the RAST annotation, the enzymes for biosynthesis of siderophore precursors and the transporters involved in the export/import of siderophores or Fe-siderophore complexes were found (Supplementary Table S4), but the genes for biosynthesis of these siderophores from their precursors were largely missing. The orthology-based search showed the presence of partial pathways only (Supplementary Table S5, Supplementary Fig. S4).
Examination of biosynthetic genes/enzymes for auxin (IAA) showed the presence of the pathway involving the tryptamine intermediate. For the alternate auxin biosynthesis pathways, a few genes were present, but the pathways were ultimately incomplete (Supplementary Table S9, Supplementary Fig. S8).
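The distinction between complete and partial pathways drawn above can be expressed as a simple set comparison between the steps a pathway requires and the functions found in the annotation. This is a minimal sketch under stated assumptions: the pathway names and enzyme labels (`enzyme_1` and so on) are placeholders rather than real annotations from the BCA-696 genome, and `pathway_completeness` is a hypothetical helper.

```python
def pathway_completeness(required_steps, annotated):
    """Return (fraction of required steps found, list of missing steps).

    required_steps: ordered list of enzyme labels a pathway needs.
    annotated:      set of enzyme labels present in the genome annotation.
    """
    missing = [step for step in required_steps if step not in annotated]
    found = len(required_steps) - len(missing)
    return found / len(required_steps), missing


# Placeholder data: labels do not correspond to real enzymes.
annotated = {"enzyme_1", "enzyme_2", "enzyme_3", "enzyme_4"}
pathways = {
    "pathway_A": ["enzyme_1", "enzyme_2", "enzyme_3"],  # all steps annotated
    "pathway_B": ["enzyme_4", "enzyme_5"],              # one step missing
}
report = {name: pathway_completeness(steps, annotated)
          for name, steps in pathways.items()}
for name, (fraction, missing) in report.items():
    status = "complete" if not missing else "partial"
    print(name, status, f"{fraction:.0%}", missing)
```

A missing step in such a table may reflect an unannotated or divergent gene rather than a true absence, which is one reason an orthology-based search is a useful complement to a single annotation pipeline.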

First genome assembly of an agriculturally important species from the Amycolatopsis genus
The Amycolatopsis genus is well documented for its antibiotic-producing traits 28,29 and is thus an obvious focal point for drug discovery programs. To understand the genomic basis of this secondary metabolite production, 150 Amycolatopsis genomes have so far been sequenced (based on data accessed in mid-2023), and about one-fifth of them are either complete or at chromosome-level assembly. The importance of Amycolatopsis in the agricultural sector is, however, not well known. To our knowledge, we were the first to report its usefulness as a PGP agent in sorghum and chickpea 15 , and its antagonistic potential against M. phaseolina-mediated charcoal rot disease in sorghum 16 . Along similar lines, to uncover the genomic components underlying the PGP/biocontrol traits, we report a chromosome-level genome assembly of the BCA-696 strain: ~ 9.06 Mb in size with 8,716 protein-coding genes (Fig. 1, Table 1).

Phylogenomic analysis indicated that Amycolatopsis sp. BCA-696 is a species on its own
The taxonomic classification of the BCA-696 strain has been available only up to the genus level. The analysis using the whole genome sequence showed that this strain is closer to A. lurida (Fig. 4). However, the poor bootstrap value for the branch of BCA-696 implied that the species-level classification of BCA-696 is still elusive with the existing genomic information.

Like several Amycolatopsis genomes, the Amycolatopsis sp. BCA-696 too had many unique genes
The Amycolatopsis genomes, in particular the BGCs, have been compared in multiple studies 22,29,30 . A comparison involving 41 genomes showed a core set of 1212 genes 29 , which is roughly one-quarter of the size observed in this study (n = 4733); the difference can be attributed mainly to the different number of genomes involved (41 versus 15). Given the diversity in the biosynthetic potential of secondary metabolites among Amycolatopsis species, the accessory and unique gene sets attain a higher importance than the core set. In the comparison of Gene Cluster Families (GCFs) across 41 genomes, the conserved features accounted for only a small proportion, and a vast number of GCFs (67%) were represented by a single genome 29 . In the BCA-696 strain, the prediction of unique features gave 466 genes (Fig. 3), which corroborates earlier reports of a significant fraction of unique genes in Amycolatopsis genomes.

Unique genes of Amycolatopsis sp. BCA-696 were enriched with antibiotic biosynthesis and resistance functions
While the functional annotation of the majority of the unique genes was unavailable, antibiotic-related functions were the most prominent among the annotated genes. The unique gene set contained the gene for the biosynthesis of the antibiotic Bialaphos (Supplementary Table S2), which has not yet been reported among Amycolatopsis species; just one Streptomyces species has been known to have this capability 31,32 . This antibiotic has been reported to have fungicidal and herbicidal roles, making it a potential biocontrol agent. In addition, several antibiotic transporters were found, which may play a role in secretion or defense (Supplementary Table S2). Another peculiar feature was the presence of chloramphenicol and streptomycin phosphotransferases, which provide resistance against those antibiotics (Supplementary Table S2).

The genes identified for PGP traits indicated inhibition of M. phaseolina potentially due to the hydrolytic enzymes and antibiotics
Plant-growth-promoting bacteria isolated from the rhizosphere are known to produce growth hormones such as auxins, siderophores, and hydrolytic enzymes such as chitinase, cellulase, and β-1,3-glucanase, and to help plants inhibit pathogens either directly or indirectly 33-35 . Amycolatopsis BCA-696 has been reported to exhibit biocontrol and PGP traits, including production of siderophores, HCN, chitinase, protease, cellulase, β-1,3-glucanase, lipase, and IAA, under in vitro conditions 15 . The genome sequence analysis indicated that this strain can potentially biosynthesize auxin only through the tryptamine pathway (Supplementary Table S9) and may need to cooperate with other rhizosphere bacteria for alternate biosynthesis pathways. Moreover, complete biosynthesis pathways were identified for cellulase, lipase, and chitinase (Supplementary Tables S6-S8). In addition, complete or partial biosynthesis pathways for diverse siderophore molecules were identified, along with a number of transporters (Supplementary Tables S3-S5). The presence of biosynthetic genes for the enediyne family of antibiotics (Supplementary Table S3), which typically act by DNA cleavage 36 , is an interesting finding in BCA-696, as some of these secondary metabolites have earlier been shown to have antifungal activity against diverse plant and animal pathogens 37,38 . Hence, we conclude that Amycolatopsis BCA-696 potentially produces hydrolytic enzymes or antibiotics that could inhibit the pathogen M. phaseolina, which causes charcoal rot in sorghum.

Future Directions
The usefulness of Amycolatopsis BCA-696 for the biocontrol of charcoal rot disease in sorghum was demonstrated under both greenhouse and field conditions in our previous study 16 . We also reported the tolerance of BCA-696 to a wide range of pH (5-11), temperatures (20-40 °C), NaCl concentrations (0-6%), and fungicides (including Bavistin up to 2500 ppm, Thiram up to 3000 ppm, Benlate up to 4000 ppm, Captan up to 3000 ppm, and Ridomil up to 3000 ppm) 15 . These traits could help BCA-696 survive in harsh environments under natural conditions, and thus this bioagent can be used in integrated disease management programs. Further, Amycolatopsis BCA-696 needs to be formulated as a bio-inoculant and evaluated for the biocontrol of charcoal rot in other crops. The secondary metabolite(s) responsible for the inhibition of M. phaseolina need to be experimentally characterized.
In the absence of a high level of genetic resistance in high-yielding varieties, Amycolatopsis BCA-696 could be effective in controlling charcoal rot disease and related loss in grain and stover quality of sorghum.

Source of actinobacterial strain
A strain of Amycolatopsis sp. BCA-696 (GenBank accession number of 16S ribosomal RNA gene: KM191337), previously reported by us to have the capacity for PGP in sorghum and chickpea 15 and antagonistic potential against Macrophomina phaseolina, which causes charcoal rot disease in sorghum 16 , was selected for the present study.

Isolation of DNA
DNA of Amycolatopsis sp. BCA-696 was isolated as per the protocols mentioned in Gopalakrishnan et al., 2020 14 . In brief, BCA-696 was inoculated in starch casein broth and incubated for 120 h at 28 °C. At the end of incubation, the culture was centrifuged at 10,000g for 10 min at 4 °C and the cells were washed twice with STE buffer (sucrose 0.3 M, Tris/HCl 25 mM and Na2EDTA 25 mM, pH 8.0). The supernatant was discarded, and the pellet (1 g) was re-suspended in 8.55 ml STE buffer and 950 µl lysozyme (20 mg/ml STE buffer) and incubated for 30 min at 30 °C. This was followed by the addition of 500 µl of SDS (10%; w/v) and 50 µl of protease (20 mg/ml), and the mixture was held at 37 °C for 1 h. At the end of incubation, 1.8 ml of NaCl (5 M) was added with gentle mixing to avoid shearing the DNA, followed by 1.5 ml of CTAB (10%; w/v) in 0.7 M NaCl (CTAB/NaCl solution), and the mixture was incubated for 20 min at 65 °C. Once CTAB was added, all the remaining steps were carried out at room temperature. The lysate was extracted twice with an equal volume of phenol/chloroform/isoamyl alcohol (25:24:1; by vol) and centrifuged at 13,000g for 10 min. Finally, the aqueous phase was extracted with chloroform/isoamyl alcohol (24:1, by vol) and transferred to a tube, followed by the addition of 600 µl of propan-2-ol, and the DNA was spooled out after 10 min. Alternatively, it was recovered by centrifugation at 12,000g for 10 min. The pellet was washed twice with ethanol (70%; v/v), vacuum dried, and dissolved in 2 ml of TE buffer (10 mM Tris/HCl and 1 mM EDTA, pH 8.0). RNase A (50 mg/ml) was added, and the sample was incubated at 37 °C for 2 h. The sample was again extracted with phenol as described above. DNA was re-precipitated from the aqueous phase with the addition of 100 µl of 3 M sodium acetate (pH 5.3) and 600 µl of propan-2-ol. The DNA pellet was washed with ethanol (70%; v/v), dried, and dissolved in TE buffer. The purity of the BCA-696 DNA was checked by agarose gel electrophoresis, and the DNA was quantified using NanoDrop.
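As a worked check on the reagent volumes in the lysis steps, the working concentrations can be derived with the usual dilution relation (stock concentration × stock volume / total volume). The sketch below assumes that volumes are additive and that the 1 g cell pellet contributes no volume; both are simplifications, and `final_concentration` is a hypothetical helper, not part of the cited protocol.

```python
def final_concentration(stock_conc, stock_vol_ml, total_vol_ml):
    """Concentration after diluting stock_vol_ml of stock into total_vol_ml."""
    return stock_conc * stock_vol_ml / total_vol_ml


# Lysozyme step: 950 µl of 20 mg/ml stock plus 8.55 ml STE buffer.
lysis_volume_ml = 8.55 + 0.950
lysozyme_mg_per_ml = final_concentration(20.0, 0.950, lysis_volume_ml)

# SDS step: 500 µl of 10% (w/v) SDS and 50 µl protease are then added.
sds_volume_ml = lysis_volume_ml + 0.500 + 0.050
sds_percent = final_concentration(10.0, 0.500, sds_volume_ml)

print(f"lysozyme: {lysozyme_mg_per_ml:.2f} mg/ml")
print(f"SDS: {sds_percent:.2f} % (w/v)")
```

Under these assumptions the lysozyme works out to about 2 mg/ml and the SDS to about 0.5% (w/v).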

Whole genome sequencing
About 5 μg of high-quality genomic DNA (free from any contaminant, with an A260/280 ratio in the range of ~ 1.8 to 2.0 and a DNA concentration ≥ 100 ng/μl) was sent to AgriGenome Labs (Kochi, India) for library preparation and next-generation sequencing using the Illumina platform. The genomic DNA was fragmented, and a paired-end library with an insert size of 300 bp and a mate-pair library with an insert size of 5 Kbp were prepared.
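The submission criteria stated above can be written as a small pre-shipment check. This is only an illustrative sketch: the function name `passes_submission_qc` is hypothetical, the bounds 1.8 to 2.0 and 100 ng/μl are taken from the text, and treating "about 5 μg" as a hard minimum of 5 μg is an assumption made here for the sake of the example.

```python
def passes_submission_qc(a260_280, conc_ng_per_ul, volume_ul, min_total_ug=5.0):
    """Check a DNA sample against the purity, concentration, and amount criteria.

    min_total_ug defaults to 5 µg; using it as a hard cutoff is an assumption.
    """
    total_ug = conc_ng_per_ul * volume_ul / 1000.0  # ng -> µg
    purity_ok = 1.8 <= a260_280 <= 2.0
    concentration_ok = conc_ng_per_ul >= 100.0
    amount_ok = total_ug >= min_total_ug
    return purity_ok and concentration_ok and amount_ok


# Hypothetical NanoDrop readings for three samples.
good = passes_submission_qc(a260_280=1.85, conc_ng_per_ul=120.0, volume_ul=50.0)
impure = passes_submission_qc(a260_280=1.60, conc_ng_per_ul=120.0, volume_ul=50.0)
dilute = passes_submission_qc(a260_280=1.90, conc_ng_per_ul=80.0, volume_ul=100.0)
print(good, impure, dilute)
```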

Figure 1. Visualization of different features on the Amycolatopsis sp. BCA-696 genome. Coding regions on forward and reverse strands are shown as the 1st and 2nd concentric circular layers from outside (the colors differentiate the gene categories shown in Fig. 2), the 3rd layer shows the average read depth in a 5 kb window along the genome, the 4th layer shows GC skew (positive values in blue and negative values in brown), the 5th layer shows rRNAs (blue bands), tRNAs (red bands), and CRISPR spacers (black bands), and the 6th layer shows similar repeats on the genome connected by lines (generated using Circos V.0.69.8).

Figure 3. Flower plot showing core genes and unique genes among fifteen closely related Amycolatopsis strains based on Orthofinder results. The number of core genes, shared by all of them, is shown in the center of the flower plot, and the unique genes for each strain are shown as flower petals (generated in RStudio using plotrix V.3.8-4) (for the list of type strains refer to "Materials and methods").

Figure 4. Phylogenetic tree obtained from the TYGS server comparing the genome of Amycolatopsis BCA-696 with several closely related species/strains available in the online server (the type strains are indicated with a "T" in superscript). The TYGS analysis differed slightly from the results of Roary and Phylophlan3, showing A. roodepoortensis diverging from the common ancestors of A. lurida and Amycolatopsis sp. BCA-696. The bootstrap values (out of 100 iterations) are shown for each branch (generated from the TYGS server, https://tygs.dsmz.de/).

Figure 5. Physical maps of predicted biosynthetic gene clusters (BGCs) in the genome of Amycolatopsis sp. BCA-696. Arrows indicate the direction of gene transcription, while the chemical classes of products of the biosynthetic pathways encoded by the genes are indicated with different colors. The positions in the genome and the maps were generated from PRISM4 (https://prism.adapsyn.com/).
https://doi.org/10.1038/s41598-024-66835-y